# Supplementary material for ICLR submission: Language Models are Graph Learners

## Main packages
1. pytorch
2. torch-geometric
3. pytz
4. scipy
5. numba (for PPR computation)
6. sentence-transformers (for GNN text encoding and the semantic retriever); installing it also pulls in transformers from Hugging Face automatically
7. accelerate (from Hugging Face)
8. ogb
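
Before running anything, it can help to confirm that the packages above are importable. The following is a minimal sketch, not part of the submission: the helper name `missing_packages` is hypothetical, and the import names are the usual ones for these packages (e.g. `torch` for pytorch, `torch_geometric` for torch-geometric), which differ from the names in the list.

```python
import importlib.util

# Maps the package name listed above to the module name used in "import ...".
# This mapping is an assumption of this sketch, not something the repo ships.
REQUIRED = {
    "pytorch": "torch",
    "torch-geometric": "torch_geometric",
    "pytz": "pytz",
    "scipy": "scipy",
    "numba": "numba",
    "sentence-transformers": "sentence_transformers",
    "accelerate": "accelerate",
    "ogb": "ogb",
}


def missing_packages(required=REQUIRED):
    """Return the listed package names whose import module cannot be found."""
    missing = []
    for package, module in required.items():
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(module) is None:
            missing.append(package)
    return missing


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

This only checks that each module can be located; it does not verify versions or CUDA support.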

## Prepare data for LMs
1. Download the raw Cora data from https://drive.google.com/file/d/1hxE0OPR7VLEHesr48WisynuoNMhXJbpl/view, or run the "process_raw_data/download.py" script instead.
2. Unzip it and place it under "raw_data", so that the resulting folder is "raw_data/cora_orig/....".
3. Delete the downloaded zip file.
4. Change to the "process_raw_data" folder and run "python generate_LLM_text.py".
5. Run "python generate_ppr_neighbor_list.py".
6. Run "python generate_text_emb.py".
7. Run "python train_GNN.py" to train the GNN and save a model for each seed.
8. Run "python load_model_make_label_list.py" to generate the predictions from the GNN.
9. Run "python generate_prototype_text.py" to generate the text for the prototypes.
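
Steps 4 to 9 can be chained in a small driver script. The sketch below is an illustration only: `build_commands` and `run_pipeline` are hypothetical helpers, and it assumes that all six scripts live in "process_raw_data" and take no command-line arguments, which the steps above suggest but do not state explicitly.

```python
import subprocess
import sys

# Script names and their order are taken from steps 4-9 above.
PIPELINE = [
    "generate_LLM_text.py",
    "generate_ppr_neighbor_list.py",
    "generate_text_emb.py",
    "train_GNN.py",
    "load_model_make_label_list.py",
    "generate_prototype_text.py",
]


def build_commands(python=sys.executable, scripts=PIPELINE):
    """Return one [interpreter, script] command per pipeline step, in order."""
    return [[python, script] for script in scripts]


def run_pipeline(workdir="process_raw_data", dry_run=False):
    """Run each step inside workdir and stop at the first failing script."""
    for cmd in build_commands():
        print("Running:", " ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so later steps never run on incomplete outputs.
            subprocess.run(cmd, cwd=workdir, check=True)
```

Calling `run_pipeline(dry_run=True)` from the root folder prints the commands without executing them, which is a cheap way to check the order before a long run.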

## Train LMs using the processed data and templates
1. Return to the root folder.
2. To use PPR retrieval only, run "python t5_train.py". NOTE: the device is set inside the file, e.g., os.environ["CUDA_VISIBLE_DEVICES"]="0"
3. To use semantic retrieval, run "python t5_train_rag.py". NOTE: the device is set inside the file, e.g., device = torch.device('cuda:2' if torch.cuda.is_available() else 'cpu')
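
The two scripts select the GPU in different ways, so the line to edit differs. The following sketch contrasts the two patterns from the notes above; the helper names `restrict_visible_gpus` and `device_string` are hypothetical and do not appear in the training scripts.

```python
import os


def restrict_visible_gpus(gpu_ids):
    """Pattern from t5_train.py: limit which GPUs the process can see.

    This must run before CUDA is initialised (in practice, before the
    first CUDA call made through torch) to have any effect.
    """
    value = ",".join(str(i) for i in gpu_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


def device_string(index, cuda_available):
    """Pattern from t5_train_rag.py: pick an explicit device, else the CPU."""
    return f"cuda:{index}" if cuda_available else "cpu"


# Usage with torch (left as comments so the sketch runs without a GPU):
#   import torch
#   device = torch.device(device_string(2, torch.cuda.is_available()))
```

Note that the two patterns interact: once CUDA_VISIBLE_DEVICES is set to a single GPU, that GPU is addressed as 'cuda:0' inside the process, whatever its physical index.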

## Others
1. The templates use specific notation for input_mode and output_mode. Please check the comments at the beginning of the files in the "templates" folder.